Evaluating the Variance of Likelihood-Ratio Gradient Estimators
Abstract
The likelihood-ratio method is often used to estimate gradients of stochastic computations, and it requires baselines to reduce the variance of the estimate. Many types of baselines have been proposed, but how close they are to optimal is not well understood. In this study, we establish a novel framework of gradient estimation that includes most of the common gradient estimators as special cases. The framework gives a natural derivation of the optimal estimator, which can be interpreted as a special case of the likelihood-ratio method, so that we can use it to evaluate how close practical techniques are to optimal. It bridges the likelihood-ratio method and the reparameterization trick while still supporting discrete variables. It is derived by exchanging the order of differentiation and integration; more specifically, it is derived from the reparameterization trick and a local marginalization analogous to the local expectation gradient. We evaluate various baselines and the optimal estimator for variational learning and show that the performance of the modern estimators is close to that of the optimal estimator.
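As a rough illustration of why baselines matter for the likelihood-ratio method, the sketch below uses a single Bernoulli variable with a logit parameter `theta` and a toy objective `f`. None of this comes from the paper: the objective, the value of `theta`, and the choice of the constant baseline E[f(z)] are assumptions made only for illustration. Because the variable has two outcomes, the mean and variance of the estimator can be enumerated exactly instead of being sampled.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def f(z):
    # Hypothetical toy objective, chosen only for illustration.
    return (z - 0.45) ** 2

def lr_estimate(z, theta, baseline=0.0):
    # Single-sample likelihood-ratio estimate of d/dtheta E[f(z)],
    # where z ~ Bernoulli(sigmoid(theta)).
    p = sigmoid(theta)
    score = z - p  # d/dtheta log p(z | theta) for a Bernoulli with logit theta
    return (f(z) - baseline) * score

def exact_moments(theta, baseline=0.0):
    # z takes only two values, so the estimator's mean and variance
    # can be computed exactly by enumeration.
    p = sigmoid(theta)
    probs = {0: 1.0 - p, 1: p}
    mean = sum(probs[z] * lr_estimate(z, theta, baseline) for z in (0, 1))
    second = sum(probs[z] * lr_estimate(z, theta, baseline) ** 2 for z in (0, 1))
    return mean, second - mean ** 2

def true_gradient(theta):
    # d/dtheta [p f(1) + (1 - p) f(0)] with dp/dtheta = p (1 - p).
    p = sigmoid(theta)
    return p * (1.0 - p) * (f(1) - f(0))

theta = 0.3  # assumed parameter value
p = sigmoid(theta)
mean_f = p * f(1) + (1.0 - p) * f(0)  # constant baseline E[f(z)] (an assumption)

mean_plain, var_plain = exact_moments(theta)
mean_base, var_base = exact_moments(theta, baseline=mean_f)

print("true gradient:   ", true_gradient(theta))
print("no baseline:     mean", mean_plain, "variance", var_plain)
print("with baseline:   mean", mean_base, "variance", var_base)
```

Both estimators have the same mean, because the score has zero expectation and a constant baseline therefore adds no bias. In this toy setting the baseline lowers the variance, but it is not claimed to be the optimal baseline discussed in the abstract.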
Similar resources
The Ratio-type Estimators of Variance with Minimum Average Square Error
Ratio-type estimators were introduced for estimating the population mean and total, but in recent years several estimators of the population variance based on ratio methods have been proposed. In this paper two families of estimators are suggested and their approximate mean square errors (MSE) are derived. In addition, the efficiency of these variance estimators are com...
Efficient Design and Sensitivity Analysis of Control Charts Using Monte Carlo Simulation
The design of control charts in statistical quality control addresses the optimal selection of the design parameters such as the sampling frequency and the control limits; and includes sensitivity analysis with respect to system parameters such as the various process parameters and the economic costs of sampling. The advent of more complicated control chart schemes has necessitated the use of M...
Renewal Monte Carlo: Renewal theory based reinforcement learning
In this paper, we present an online reinforcement learning algorithm, called Renewal Monte Carlo (RMC), for infinite horizon Markov decision processes with a designated start state. RMC is a Monte Carlo algorithm and retains the advantages of Monte Carlo methods including low bias, simplicity, and ease of implementation while, at the same time, circumvents their key drawbacks of high variance a...
On using generalized Cramér-Rao inequality to REML estimation in linear models
The main aim in considering the problem of estimating the variance components $\sigma_1^2$ and $\sigma_2^2$ by the ML and REML methods in the normal mixed linear model $N\{Y, E(Y) = X\beta, \mathrm{Cov}(Y) = \sigma_1^2 V + \sigma_2^2 I_n\}$ was to examine their efficiency. It is particularly important when an explicit form of these estimators is unknown and we search for the solutions of the likelihood equa...
Pitman-Closeness of Preliminary Test and Some Classical Estimators Based on Records from Two-Parameter Exponential Distribution
In this paper, we study the performance of estimators of the parameters of the two-parameter exponential distribution based on upper records. The generalized likelihood ratio (GLR) test was used to generate a preliminary test estimator (PTE) for both parameters. We have compared the proposed estimator with maximum likelihood (ML) and unbiased estimators (UE) under mean-squared error (MSE) and Pitman me...